shape and texture
ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections
Estimating 3D articulated shapes, such as animal bodies, from monocular images is inherently challenging due to ambiguities in camera viewpoint, pose, texture, lighting, and more. We propose ARTIC3D, a self-supervised framework that reconstructs per-instance 3D shapes from a sparse in-the-wild image collection. Specifically, ARTIC3D is built upon a skeleton-based surface representation and is further guided by 2D diffusion priors from Stable Diffusion. First, we enhance input images that contain occlusions or truncation via 2D diffusion to obtain cleaner mask estimates and semantic features. Second, we perform diffusion-guided 3D optimization to estimate shape and texture that are high-fidelity and faithful to the input images. We also propose a novel technique to calculate more stable image-level gradients via diffusion models than existing alternatives. Finally, we produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations. Extensive evaluations on multiple existing datasets, as well as newly introduced noisy web image collections with occlusions and truncation, demonstrate that ARTIC3D outputs are more robust to noisy images, higher quality in shape and texture details, and more realistic when animated.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
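The diffusion-guided 3D optimization described in the ARTIC3D abstract follows the general score-distillation pattern: render a view, add noise, ask a diffusion model to estimate that noise, and use the gap between estimated and injected noise as an image-level gradient. Below is a minimal runnable sketch of that pattern; `PRIOR`, `toy_denoiser`, and the one-parameter noise schedule are toy stand-ins (no Stable Diffusion here), and this is not ARTIC3D's actual method — the paper's contribution is a more stable variant of such gradients.

```python
import numpy as np

PRIOR = np.full((8, 8, 3), 0.5)  # stand-in for the image the prior considers "clean"

def toy_denoiser(noisy, t):
    # Inverts the toy forward process under the assumption that the clean
    # image is PRIOR: noisy = sqrt(1-t)*clean + sqrt(t)*eps.
    return (noisy - np.sqrt(1.0 - t) * PRIOR) / np.sqrt(t)

def sds_step(rendered, lr=0.5, t=0.6, seed=0):
    """One score-distillation-style update on a rendered image."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=rendered.shape)                   # injected noise
    noisy = np.sqrt(1.0 - t) * rendered + np.sqrt(t) * eps  # forward diffusion
    eps_hat = toy_denoiser(noisy, t)                        # model's noise estimate
    grad = eps_hat - eps                                    # score-distillation gradient
    return rendered - lr * grad

img = np.zeros((8, 8, 3))  # stands in for a differentiably rendered view
for step in range(50):
    img = sds_step(img, seed=step)
# img has been pulled toward what the (toy) prior considers a clean image
```

In a real pipeline the gradient would flow through the renderer into 3D shape and texture parameters; here it is applied to the image directly to keep the loop self-contained.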
On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process
Iwase, Shun, Takahashi, Shuya, Inoue, Nakamasa, Yokota, Rio, Nakamura, Ryo, Kataoka, Hirokatsu
The double descent phenomenon, which deviates from the traditional bias-variance trade-off theory, has attracted considerable research attention; however, the mechanism behind its occurrence is not fully understood. Meanwhile, in the study of convolutional neural networks (CNNs) for image recognition, methods have been proposed to quantify the bias toward shape features versus texture features in images, i.e., to determine which type of feature a CNN relies on more. In this work, we hypothesize that the shape/texture bias in the learning process of CNNs is related to epoch-wise double descent, and we verify this hypothesis experimentally. We discover a double descent/ascent of shape/texture bias that is synchronized with the double descent of test error under conditions where epoch-wise double descent is observed. Quantitative evaluations confirm the correlation between the test error and the bias values, from the initial decrease through the subsequent increase in test error. Interestingly, the double descent/ascent of shape/texture bias is observed in some cases even without label noise, where double descent is generally thought not to occur. These results contribute to understanding the mechanisms behind the double descent phenomenon and the learning process of CNNs in image recognition.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
- North America > United States (0.14)
- North America > Canada > Ontario > Toronto (0.14)
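The shape/texture bias this paper tracks across epochs is typically measured on cue-conflict images (the shape of one class rendered with the texture of another), in the style of Geirhos et al.: count how often the model's decision follows the shape label rather than the texture label. A sketch of that metric on made-up toy labels, not data from the paper:

```python
def shape_bias(predictions, shape_labels, texture_labels):
    """Fraction of cue-conflict decisions that follow shape rather than texture.

    Only predictions matching either the shape or the texture label count;
    everything else is ignored, as in the usual definition of the metric.
    """
    shape_hits = texture_hits = 0
    for pred, s, t in zip(predictions, shape_labels, texture_labels):
        if pred == s:
            shape_hits += 1
        elif pred == t:
            texture_hits += 1
    decided = shape_hits + texture_hits
    return shape_hits / decided if decided else float("nan")

# Toy cue-conflict batch: each image has a shape class and a texture class.
shapes   = ["cat", "dog", "car", "cat", "dog"]
textures = ["elephant", "clock", "cat", "dog", "car"]
preds    = ["cat", "clock", "car", "dog", "dog"]  # the model's decisions
print(shape_bias(preds, shapes, textures))        # → 0.6
```

Tracking this value once per epoch, alongside test error, is the kind of measurement that reveals the synchronized descent/ascent the paper reports.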
ML Researchers Propose A Novel AI Model To Effectively Handle The Texture-Age Evolution
A lifespan face synthesis (LFS) model aims to create a set of photo-realistic images showing what a person would look like over their whole life, given just one picture as a reference. The generated images are expected to be age-sensitive, with realistic transformations in shape and texture, while preserving the person's identity. The task is challenging because different aspects of a face age through separate but highly nonlinear changes; for example, skin loses elasticity and can appear wrinkled or saggy sooner than other facial features change. The latest LFS models are built on generative adversarial networks and use conditional transformations driven by an explicit age code. They have benefitted significantly from recent advances in GANs, and they are still improving.
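The conditioning mechanism described above can be illustrated with a toy interface: concatenate an explicit age code onto an identity latent before generation, so the same identity can be rendered at different ages. The random linear "generator" below is purely a stand-in for a trained GAN; all names and dimensions are illustrative assumptions, not details of any published LFS model.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, N_AGE_GROUPS, IMG_PIXELS = 16, 6, 64
W = rng.normal(size=(LATENT_DIM + N_AGE_GROUPS, IMG_PIXELS))  # stand-in generator weights

def generate(identity_latent, age_group):
    """Concatenate a one-hot age code to the identity latent, then 'generate'."""
    age_code = np.zeros(N_AGE_GROUPS)
    age_code[age_group] = 1.0
    z = np.concatenate([identity_latent, age_code])
    return np.tanh(z @ W)  # fake "image", values in [-1, 1]

identity = rng.normal(size=LATENT_DIM)       # one reference "person"
young = generate(identity, 0)                # same identity latent...
old   = generate(identity, 5)                # ...different age code
```

Because only the age code changes between the two calls, any difference in output is attributable to the conditioning — the property an age-sensitive, identity-preserving generator needs.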
MeInGame: Create a Game Character Face from a Single Portrait
Lin, Jiangke, Yuan, Yi, Zou, Zhengxia
Many deep learning based 3D face reconstruction methods have been proposed recently; however, few of them have applications in games. Current game character customization systems either require players to manually adjust a large number of face attributes to obtain the desired face, or offer limited freedom of facial shape and texture. In this paper, we propose an automatic character face creation method that predicts both facial shape and texture from a single portrait and can be integrated into most existing 3D games. Although 3D Morphable Face Model (3DMM) based methods can restore accurate 3D faces from single images, the topology of the 3DMM mesh differs from the meshes used in most games. To acquire high-fidelity textures, existing methods require a large amount of face texture data for training, while building such datasets is time-consuming and laborious. Moreover, a dataset collected under laboratory conditions may not generalize well to in-the-wild situations. To tackle these problems, we propose 1) a low-cost facial texture acquisition method, 2) a shape transfer algorithm that can transform the shape of a 3DMM mesh to games, and 3) a new pipeline for training 3D game face reconstruction networks. The proposed method can not only produce detailed and vivid game characters similar to the input portrait, but also eliminate the influence of lighting and occlusions. Experiments show that our method outperforms state-of-the-art methods used in games.
- North America > United States > Michigan (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
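One simple way to realize a shape transfer between different mesh topologies, such as the 3DMM-to-game-mesh step MeInGame describes, is nearest-neighbor correspondence plus displacement copying: map each game-mesh vertex to its closest template vertex and apply that vertex's reconstructed offset. The sketch below uses toy point sets; MeInGame's actual algorithm is more involved, and nothing here is taken from the paper.

```python
import numpy as np

def transfer_shape(template, deformed, game_mesh):
    """Apply the template->deformed displacement field to game_mesh vertices."""
    displacements = deformed - template                         # (N, 3) per-vertex offsets
    # Nearest template vertex for each game vertex (brute force, fine for toys).
    d2 = ((game_mesh[:, None, :] - template[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)                                 # (M,) correspondences
    return game_mesh + displacements[nearest]

template  = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0]])       # 3DMM-like template
deformed  = template + np.array([[0.0, 0, 0.2], [0, 0, 0.2], [0, 0, -0.1]])
game_mesh = np.array([[0.1, 0.1, 0], [0.9, 0.1, 0]])            # different vertex count
# Each game vertex picks up its nearest template vertex's offset.
transferred = transfer_shape(template, deformed, game_mesh)
```

A production version would use barycentric correspondence on triangles and smoothing rather than raw nearest vertices, but the interface — template, deformed template, target mesh in — is the same.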
Foundations Built for a General Theory of Neural Networks Quanta Magazine
When we design a skyscraper, we expect it to perform to specification: the tower will support so much weight and withstand an earthquake of a certain strength. Not so with one of today's most important technologies: we play with different designs and tinker with different setups, but until we take it out for a test run, we don't really know what it can do or where it will fail. That technology is the neural network, which underpins today's most advanced artificial intelligence systems. Increasingly, neural networks are moving into the core areas of society: they determine what we learn of the world through our social media feeds, they help doctors diagnose illnesses, and they even influence whether a person convicted of a crime will spend time in jail. Yet "the best approximation to what we know is that we know almost nothing about how neural networks actually work and what a really insightful theory would be," said Boris Hanin, a mathematician at Texas A&M University and a visiting scientist at Facebook AI Research who studies neural networks.
- North America > United States > Texas (0.25)
- North America > United States > Pennsylvania (0.05)
- North America > United States > Oklahoma (0.05)
- North America > United States > Massachusetts (0.05)
DeSIGN: Design Inspiration from Generative Networks
Sbai, Othman, Elhoseiny, Mohamed, Bordes, Antoine, LeCun, Yann, Couprie, Camille
Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant? To help answer this question, we design and investigate different image generation models, each associated with a different loss function, to boost creativity in fashion generation. The dimensions of our exploration include: (i) different Generative Adversarial Network architectures that start from noise vectors to generate fashion items, (ii) a new loss function that encourages creativity, and (iii) a generation process that follows the key elements of fashion design (disentangling shape and texture makers). A key challenge of this study is the evaluation of generated designs and the retrieval of the best ones, hence we put together an evaluation protocol combining automatic metrics and human experimental studies that we hope will help ease future research. We show that our proposed creativity loss yields better overall appreciation than the one employed in Creative Adversarial Networks. In the end, about 61% of our images are thought to be created by human designers rather than by a computer while also being considered original in our human subject experiments, and our proposed loss scores highest among existing losses in both novelty and likability.
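The Creative Adversarial Networks loss that this abstract compares against rewards designs whose style-classifier output is far from every existing category, typically by pulling it toward the uniform distribution over style classes. A toy sketch of that baseline idea (not the paper's own proposed loss), on made-up probabilities:

```python
import numpy as np

def creativity_loss(class_probs):
    """Cross-entropy between a uniform target and predicted style probabilities.

    Minimal when the classifier is maximally unsure which existing style the
    generated design belongs to, i.e. when the design fits no known category.
    """
    return float(-np.log(class_probs + 1e-12).mean())

confident = np.array([0.97, 0.01, 0.01, 0.01])  # clearly one existing style: penalized
uniform   = np.full(4, 0.25)                     # fits no style in particular: rewarded
```

In a full training loop this term would be weighted against the usual adversarial realism loss, so designs stay plausible fashion items while avoiding existing style categories.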